expert module
Access Controls Will Solve the Dual-Use Dilemma
AI safety systems face the dual-use dilemma. It is unclear whether to answer dual-use requests, since the same query could be either harmless or harmful depending on who made it and why. To make better decisions, such systems would need to examine requests' real-world context, but currently, they lack access to this information. Instead, they sometimes end up making arbitrary choices that result in refusing legitimate queries and allowing harmful ones, which hurts both utility and safety. To address this, we propose a conceptual framework based on access controls where only verified users can access dual-use outputs. We describe the framework's components, analyse its feasibility, and explain how it addresses both over-refusals and under-refusals. While only a high-level proposal, our work takes the first step toward giving model providers more granular tools for managing dual-use content. Such tools would enable users to access more capabilities without sacrificing safety, and offer regulators new options for targeted policies.
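The abstract describes the framework only at a high level. As a concrete illustration of the core idea, the sketch below gates a dual-use answer behind verification tiers; the tier names, the classifier stub, and the three response options are assumptions made for this example, not components specified by the paper.

```python
# Minimal illustration (not the paper's implementation) of gating dual-use
# outputs behind verified access tiers. Tier names, the classifier stub, and
# the policy are assumptions made for this sketch.
from dataclasses import dataclass
from enum import Enum, auto


class Verification(Enum):
    ANONYMOUS = auto()      # no identity or purpose established
    VERIFIED_USER = auto()  # identity verified by the provider
    VETTED_EXPERT = auto()  # identity plus a credentialed legitimate use


@dataclass
class Request:
    text: str
    user_tier: Verification


def classify_dual_use(text: str) -> bool:
    """Stand-in for a dual-use classifier; a real system would use a model."""
    return "synthesis route" in text.lower()


def decide(request: Request) -> str:
    """Serve benign queries to everyone; gate dual-use ones by access tier."""
    if not classify_dual_use(request.text):
        return "answer"
    if request.user_tier is Verification.VETTED_EXPERT:
        return "answer"           # verified legitimate use: avoids over-refusal
    if request.user_tier is Verification.VERIFIED_USER:
        return "safe_completion"  # partial, high-level answer only
    return "refuse"               # unverified context: avoids under-refusal


if __name__ == "__main__":
    print(decide(Request("Explain this synthesis route", Verification.ANONYMOUS)))
```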
PRISM: Periodic Representation with multIscale and Similarity graph Modelling for enhanced crystal structure property prediction
Solé, Àlex, Mosella-Montoro, Albert, Cardona, Joan, Aravena, Daniel, Gómez-Coca, Silvia, Ruiz, Eliseo, Ruiz-Hidalgo, Javier
Crystal structures are characterised by repeating atomic patterns within unit cells across three-dimensional space, posing unique challenges for graph-based representation learning. Current methods often overlook essential periodic boundary conditions and multiscale interactions inherent to crystalline structures. In this paper, we introduce PRISM, a graph neural network framework that explicitly integrates multiscale representations and periodic feature encoding by employing a set of expert modules, each specialised in encoding distinct structural and chemical aspects of periodic systems. Extensive experiments across crystal structure-based benchmarks demonstrate that PRISM improves state-of-the-art predictive accuracy, significantly enhancing crystal property prediction.
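To make the expert-module idea concrete, here is a minimal sketch in which several specialised encoders each process one view of a structure and their outputs are fused for property prediction. PRISM itself is a graph neural network with periodic and multiscale encodings; the dense encoders, feature views, and dimensions below are placeholder assumptions, not the authors' architecture.

```python
# Sketch of the expert-module fusion idea only; not PRISM's graph architecture.
import torch
import torch.nn as nn


class ExpertModule(nn.Module):
    """One expert specialised in a single view of the structure."""

    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.SiLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class MultiExpertPropertyHead(nn.Module):
    """Fuses per-view expert embeddings and regresses a scalar property."""

    def __init__(self, view_dims: dict, out_dim: int = 64):
        super().__init__()
        self.experts = nn.ModuleDict(
            {name: ExpertModule(d, 128, out_dim) for name, d in view_dims.items()}
        )
        self.readout = nn.Linear(out_dim * len(view_dims), 1)

    def forward(self, views: dict) -> torch.Tensor:
        fused = torch.cat([self.experts[k](v) for k, v in views.items()], dim=-1)
        return self.readout(fused)


# Toy usage: three assumed "views" of 4 crystals (local geometry,
# periodic/multiscale context, composition), each a precomputed feature vector.
model = MultiExpertPropertyHead({"local": 32, "periodic": 16, "composition": 8})
batch = {"local": torch.randn(4, 32), "periodic": torch.randn(4, 16),
         "composition": torch.randn(4, 8)}
print(model(batch).shape)  # torch.Size([4, 1])
```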
Dynamic Adaptive Shared Experts with Grouped Multi-Head Attention Mixture of Experts
Li, Cheng, Liu, Jiexiong, Chen, Yixuan, Ji, Jie
Transformer models based on the Mixture of Experts (MoE) architecture have made significant progress in long-sequence modeling, but existing models still have shortcomings in computational efficiency and the ability to capture long-range dependencies, especially in terms of the dynamic adaptability of expert resource allocation. In this paper, we propose a Dynamic Adaptive Shared Expert and Grouped Multi-Head Attention Hybrid Model (DASG-MoE) to enhance long-sequence modeling capabilities by integrating three modules. First, we employ the Grouped Multi-Head Attention (GMHA) mechanism to effectively reduce the computational complexity of long sequences. By parallel processing through sequence grouping, local sliding window attention, and feature aggregation, we address long-range dependency issues and the model's lack of generalization for local information. Second, we design a Dual-Scale Shared Expert Structure (DSSE), where shallow experts use lightweight computations to quickly respond to low-dimensional features, while deep experts process high-dimensional complex semantics through pre-training transfer and post-training optimization, achieving a dynamic balance between efficiency and accuracy. Third, we propose a hierarchical Adaptive Dynamic Routing (ADR) mechanism that dynamically selects expert levels based on feature complexity and task requirements, and optimizes resource allocation through a local expert activation strategy. Experiments on multiple long-sequence benchmark datasets demonstrate that our DASG-MoE model outperforms state-of-the-art models.
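A minimal sketch of the dual-scale expert and adaptive-routing idea follows: a router scores each token's complexity and blends a shallow and a deep expert accordingly. The scoring rule, sizes, and threshold are assumptions for illustration, and both experts run densely for clarity, whereas DASG-MoE activates experts selectively.

```python
# Illustrative sketch of dual-scale experts with a complexity-based router;
# not DASG-MoE's actual mechanism.
import torch
import torch.nn as nn


class DualScaleRoutedFFN(nn.Module):
    def __init__(self, d_model: int = 64, threshold: float = 0.5):
        super().__init__()
        self.shallow = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU())
        self.deep = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.router = nn.Linear(d_model, 1)   # per-token "complexity" score
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        score = torch.sigmoid(self.router(x))    # (batch, seq, 1)
        hard = (score > self.threshold).float()  # route "hard" tokens to the deep expert
        # Both experts are computed densely here for clarity; a real MoE
        # would dispatch tokens sparsely to save compute.
        return hard * self.deep(x) + (1.0 - hard) * self.shallow(x)


x = torch.randn(2, 10, 64)
print(DualScaleRoutedFFN()(x).shape)  # torch.Size([2, 10, 64])
```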
A Multi-Expert Structural-Semantic Hybrid Framework for Unveiling Historical Patterns in Temporal Knowledge Graphs
Deng, Yimin, Wu, Yuxia, Wang, Yejing, Zhao, Guoshuai, Zhu, Li, Liu, Qidong, Xu, Derong, Fu, Zichuan, Wu, Xian, Zheng, Yefeng, Zhao, Xiangyu, Qian, Xueming
Temporal knowledge graph reasoning aims to predict future events with knowledge of existing facts and plays a key role in various downstream tasks. Previous methods focused on either graph structure learning or semantic reasoning, failing to integrate dual reasoning perspectives to handle different prediction scenarios. Moreover, they lack the capability to capture the inherent differences between historical and non-historical events, which limits their generalization across different temporal contexts. To this end, we propose a Multi-Expert Structural-Semantic Hybrid (MESH) framework that employs three kinds of expert modules to integrate both structural and semantic information, guiding the reasoning process for different events. Extensive experiments on three datasets demonstrate the effectiveness of our approach.
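As a rough illustration of the structural-semantic hybrid, the sketch below has two stand-in experts score candidate entities and a learned gate mix their distributions per query; the dimensions and the gating rule are assumptions, not MESH's actual design.

```python
# Hedged sketch: mix entity-score distributions from a structural expert and
# a semantic expert with a per-query gate.
import torch
import torch.nn as nn


class HybridScorer(nn.Module):
    def __init__(self, d: int = 32, num_entities: int = 100):
        super().__init__()
        self.structural = nn.Linear(d, num_entities)  # stand-in graph-structure expert
        self.semantic = nn.Linear(d, num_entities)    # stand-in semantic expert
        self.gate = nn.Sequential(nn.Linear(d, 2), nn.Softmax(dim=-1))

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        w = self.gate(query)                            # (batch, 2) expert weights
        scores = torch.stack(
            [self.structural(query), self.semantic(query)], dim=1
        )                                               # (batch, 2, num_entities)
        return (w.unsqueeze(-1) * scores).sum(dim=1)    # mixed entity scores


q = torch.randn(4, 32)  # embeddings of (subject, relation, timestamp) queries
print(HybridScorer()(q).shape)  # torch.Size([4, 100])
```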
Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts
Lee, Rhui Dih, Wynter, Laura, Ganti, Raghu Kiran
We present a toolkit for creating low-cost Mixture-of-Domain-Experts (MOE) from trained models. The toolkit can be used for creating a mixture from models or from adapters. We perform extensive tests and offer guidance on defining the architecture of the resulting MOE using the toolkit. A public repository is available.
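The abstract does not show the toolkit's API, so the sketch below only illustrates the underlying idea: reuse feed-forward blocks taken from already trained domain models as experts behind a freshly initialised router. All names and sizes are assumptions.

```python
# Not the toolkit's API; a minimal sketch of mixing trained FFNs into an MoE.
import torch
import torch.nn as nn


def make_ffn(d_model: int) -> nn.Module:
    return nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                         nn.Linear(4 * d_model, d_model))


class MixtureOfDomainExperts(nn.Module):
    def __init__(self, experts, d_model: int, top_k: int = 1):
        super().__init__()
        self.experts = nn.ModuleList(experts)            # FFNs taken from trained models
        self.router = nn.Linear(d_model, len(experts))   # the only newly trained weights
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.router(x)                                   # (..., n_experts)
        weights, idx = logits.softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        # Dense loop for clarity; real MoE layers dispatch tokens sparsely.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e).unsqueeze(-1).float()
                out = out + mask * weights[..., k:k + 1] * expert(x)
        return out


# Pretend these two FFNs were extracted from two fine-tuned domain models.
math_ffn, code_ffn = make_ffn(64), make_ffn(64)
moe = MixtureOfDomainExperts([math_ffn, code_ffn], d_model=64)
print(moe(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```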
Mixture of Experts based Multi-task Supervised Learning from Crowds
Han, Tao, Shi, Huaixuan, Ding, Xinyi, Ma, Xiao, Gu, Huamao, Fang, Yili
Existing truth inference methods in crowdsourcing aim to map redundant labels and items to the ground truth. They treat the ground truth as hidden variables and use statistical or deep learning-based worker behavior models to infer the ground truth. However, worker behavior models that rely on ground truth hidden variables overlook workers' behavior at the item feature level, leading to imprecise characterizations and negatively impacting the quality of truth inference. This paper proposes a new paradigm of multi-task supervised learning from crowds, which eliminates the need to model items' ground truth in worker behavior models. Within this paradigm, we propose a worker behavior model at the item feature level called Mixture of Experts based Multi-task Supervised Learning from Crowds (MMLC). Two truth inference strategies are proposed within MMLC. The first strategy, named MMLC-owf, utilizes clustering methods in the worker spectral space to identify the projection vector of the oracle worker. Subsequently, the labels generated from this vector are taken as the inferred truth. The second strategy, called MMLC-df, employs the MMLC model to fill in the crowdsourced data, which can enhance the effectiveness of existing truth inference methods. Experimental results demonstrate that MMLC-owf outperforms state-of-the-art methods and that MMLC-df enhances the quality of existing truth inference methods.
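A hedged sketch of the shared-experts-plus-per-worker-gate idea: all workers share item-feature experts, and each worker owns a mixing vector over them. The shapes and the crude oracle estimate (averaging worker vectors rather than clustering in the worker spectral space) are illustrative assumptions, not MMLC's exact formulation.

```python
# Sketch of a feature-level worker behavior model with shared experts and
# per-worker gates; the oracle estimate below is a crude stand-in.
import torch
import torch.nn as nn


class CrowdMoE(nn.Module):
    def __init__(self, in_dim: int, n_experts: int, n_classes: int, n_workers: int):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, n_classes) for _ in range(n_experts)]
        )
        # One mixing vector per worker over the shared experts.
        self.worker_gates = nn.Parameter(torch.zeros(n_workers, n_experts))

    def forward(self, x: torch.Tensor, worker_ids: torch.Tensor) -> torch.Tensor:
        expert_logits = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, C)
        gate = self.worker_gates[worker_ids].softmax(-1).unsqueeze(-1)    # (B, E, 1)
        return (gate * expert_logits).sum(dim=1)      # per-worker label logits

    def oracle_logits(self, x: torch.Tensor) -> torch.Tensor:
        # Stand-in for MMLC-owf: use the average worker vector as the "oracle".
        expert_logits = torch.stack([e(x) for e in self.experts], dim=1)
        gate = self.worker_gates.mean(0).softmax(-1).view(1, -1, 1)
        return (gate * expert_logits).sum(dim=1)


model = CrowdMoE(in_dim=16, n_experts=4, n_classes=3, n_workers=10)
x, w = torch.randn(8, 16), torch.randint(0, 10, (8,))
print(model(x, w).shape, model.oracle_logits(x).shape)  # (8, 3) (8, 3)
```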
M3oE: Multi-Domain Multi-Task Mixture-of-Experts Recommendation Framework
Zhang, Zijian, Liu, Shuchang, Yu, Jiaao, Cai, Qingpeng, Zhao, Xiangyu, Zhang, Chunxu, Liu, Ziru, Liu, Qidong, Zhao, Hongwei, Hu, Lantao, Jiang, Peng, Gai, Kun
Multi-domain recommendation and multi-task recommendation have demonstrated their effectiveness in leveraging common information from different domains and objectives for comprehensive user modeling. Nonetheless, practical recommendation systems usually face multiple domains and tasks simultaneously, which current methods cannot address well. To this end, we introduce M3oE, an adaptive Multi-domain Multi-task Mixture-of-Experts recommendation framework. M3oE integrates multi-domain information, maps knowledge across domains and tasks, and optimizes multiple objectives. We leverage three mixture-of-experts modules to learn common, domain-aspect, and task-aspect user preferences respectively, addressing the complex dependencies among multiple domains and tasks in a disentangled manner. Additionally, we design a two-level fusion mechanism for precise control over feature extraction and fusion across diverse domains and tasks. The framework's adaptability is further enhanced by applying an AutoML technique, which allows dynamic structure optimization. To the best of the authors' knowledge, M3oE is the first effort to solve multi-domain multi-task recommendation self-adaptively. Extensive experiments on two benchmark datasets against diverse baselines demonstrate M3oE's superior performance. The implementation code is available to ensure reproducibility.
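The sketch below shows only the three-branch idea, with shared, domain-aspect, and task-aspect expert groups fused before a task head; M3oE's gating, two-level fusion, and AutoML-driven structure search are richer than this, and all sizes here are assumed.

```python
# Sketch of shared / domain-aspect / task-aspect expert branches with a
# simple concatenation fusion; not M3oE's actual architecture.
import torch
import torch.nn as nn


class SimpleMoE(nn.Module):
    def __init__(self, d: int, n_experts: int):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(d, d) for _ in range(n_experts)])
        self.gate = nn.Linear(d, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(x).softmax(-1)                              # (B, E)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, d)
        return (w.unsqueeze(-1) * outs).sum(1)


class ThreeBranchRecommender(nn.Module):
    def __init__(self, d: int = 32, n_domains: int = 2, n_tasks: int = 2):
        super().__init__()
        self.shared = SimpleMoE(d, 4)                                      # common preferences
        self.domain = nn.ModuleList([SimpleMoE(d, 2) for _ in range(n_domains)])
        self.task = nn.ModuleList([SimpleMoE(d, 2) for _ in range(n_tasks)])
        self.heads = nn.ModuleList([nn.Linear(3 * d, 1) for _ in range(n_tasks)])

    def forward(self, x: torch.Tensor, domain: int, task: int) -> torch.Tensor:
        fused = torch.cat([self.shared(x), self.domain[domain](x),
                           self.task[task](x)], dim=-1)
        return torch.sigmoid(self.heads[task](fused))   # e.g. click probability


model = ThreeBranchRecommender()
print(model(torch.randn(8, 32), domain=1, task=0).shape)  # torch.Size([8, 1])
```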
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
Gao, Timin, Chen, Peixian, Zhang, Mengdan, Fu, Chaoyou, Shen, Yunhang, Zhang, Yan, Zhang, Shengchuan, Zheng, Xiawu, Sun, Xing, Cao, Liujuan, Ji, Rongrong
With the advent of large language models(LLMs) enhanced by the chain-of-thought(CoT) methodology, visual reasoning problem is usually decomposed into manageable sub-tasks and tackled sequentially with various external tools. However, such a paradigm faces the challenge of the potential "determining hallucinations" in decision-making due to insufficient visual information and the limitation of low-level perception tools that fail to provide abstract summaries necessary for comprehensive reasoning. We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks. This paper delves into the realm of multimodal CoT to solve intricate visual reasoning tasks with multimodal large language models(MLLMs) and their cognitive capability. To this end, we propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture. Cantor first acts as a decision generator and integrates visual inputs to analyze the image and problem, ensuring a closer alignment with the actual context. Furthermore, Cantor leverages the advanced cognitive functions of MLLMs to perform as multifaceted experts for deriving higher-level information, enhancing the CoT generation process. Our extensive experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance across two complex visual reasoning datasets, without necessitating fine-tuning or ground-truth rationales. Project Page: https://ggg0919.github.io/cantor/ .
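Since Cantor works by prompting a single MLLM into different roles, a rough orchestration sketch is given below; the call_mllm placeholder, the role prompts, and the two-stage structure are paraphrased assumptions for illustration, not the project's actual prompts or code.

```python
# Rough sketch of a perception-decision prompting flow; call_mllm is a
# hypothetical placeholder for any multimodal LLM API.
from typing import Callable, Optional

MLLM = Callable[[str, Optional[bytes]], str]  # (prompt, optional image) -> text


def cantor_style_answer(question: str, image: bytes, call_mllm: MLLM) -> str:
    # Stage 1: decision generation -- inspect the image and question, then
    # assign sub-tasks to "expert" roles.
    plan = call_mllm(
        "You are a decision generator. Given the image and question below, "
        "list the sub-tasks needed and which expert (e.g. object counting, "
        "text reading, spatial reasoning) should handle each.\n" + question,
        image,
    )
    # Stage 2: the same MLLM acts as each expert, producing higher-level
    # observations rather than raw low-level tool outputs.
    observations = call_mllm(
        "Act as each expert listed in this plan and report your findings:\n" + plan,
        image,
    )
    # Final synthesis: combine the plan and expert findings into the answer.
    return call_mllm(
        f"Question: {question}\nPlan: {plan}\nFindings: {observations}\n"
        "Give the final answer with a short rationale.",
        None,
    )
```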